About Certain Semantic Annotation in Parallel Corpora
Similar resources
QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages
This work presents parallel corpora automatically annotated with several NLP tools, including lemma and part-of-speech tagging, named-entity recognition and classification, named-entity disambiguation, word-sense disambiguation, and coreference. The corpora comprise both the well-known Europarl corpus and a domain-specific question-answer troubleshooting corpus on the IT domain. English is comm...
Semantic Typology and Parallel Corpora: Something about Indefinite Pronouns
Patterns of crosslinguistic variation in the expression of word meaning are informative about semantic organization, but most methods to study this are labor intensive and obscure the gradient nature of concepts. We propose an automatic method for extracting crosslinguistic co-categorization patterns from parallel texts, and explore the properties of the data as a potential source for automatic...
Semantic Annotation of Multilingual Text Corpora
This paper describes a multi-site project to annotate six sizable bilingual parallel corpora for interlingual content. After presenting the background and objectives of the effort, we describe the data set that is being annotated, the interlingua representation language used, an interface environment that supports the annotation task and the annotation process itself. We then present our evalua...
Semantic annotation of French corpora: animacy and verb semantic classes
This paper presents a first corpus of French annotated for animacy and for verb semantic classes. The resource consists of 1,346 sentences extracted from three different corpora: the French Treebank (Abeillé and Barrier, 2004), the Est-Républicain corpus (CNRTL) and the ESTER corpus (ELRA). It is a set of parsed sentences, containing a verbal head subcategorizing two complements, with annotatio...
Shallow Semantic Annotation of Biomedical Corpora for Information Extraction
Work over the last few years in literature data mining for biology has progressed from linguistically unsophisticated models to the adaptation of Natural Language Processing (NLP) techniques that use full parsers ([11, 16]) and coreference to extract relations that span multiple sentences ([12, 6]) (For an overview, see [7]). However, there has been a lack of annotated corpora that can fuel fur...
Journal
Journal title: Cognitive Studies | Études cognitives
Year: 2015
ISSN: 2392-2397
DOI: 10.11649/cs.2013.004